1. INTRODUCTION

In today's data-driven world, technological advances have made it possible to process vast amounts
of data in order to analyse patterns of event occurrences. Such data are used in many settings, for
instance in constructing predictive data models for weather and environmental event forecasting, traffic
analysis, sentiment analysis, and many other applications. Hardware and software for data-driven
technology continue to develop without apparent limit. An example of applied data-driven technology has
been published by Kasabov et al. [1], where data analysis was applied to ecological data, health-related
data, aphid population prediction, and others. Data-driven technology has also been applied on
neuromorphic hardware [1] to better visualise the constructed predictive models.

Kasabov [2, 3] has noted that most environment-related events can be captured as
spatio/spectro-temporal data (SSTD) by recording data from different places (the spatial component, or
localities) together with timed measurements of the corresponding variables. Several works have also been
published on ecological data [4, 5], stroke data [1, 4, 6-9], and learning of spatio-temporal brain data
[10, 11], demonstrating the capability of computational predictive methods to extract knowledge from SSTD.

Analysing and understanding SSTD is considered a challenging task for three reasons: the close interaction
and interrelationship between the spatial/spectral components and the temporal components [4]; the
unsuitability of conventional machine learning for classifying SSTD, since it is only suitable for
classifying vector-based, static data [10]; and the requirement that training and testing samples have
the same number of input features, so that eliminating some inputs to obtain a consistent number of features
causes information loss [4].

Investigating natural events requires the analysis of complex, high-dimensional data in the form of
spatio- and spectro-temporal data (SSTD), and it is important to retain the interrelationship information
between the components. Data mining tools provide a promising framework for discovering hidden patterns
within a multi-dimensional dataset, allowing occurrences to be predicted. They are applicable in various
domains, such as prediction of environmental disaster occurrences, risk of stroke [1, 3, 4, 6-9],
assessment of ecological data [5, 12, 13], prediction of unknown gene functions [14], and many others
[7, 15-17].

In ecological and river engineering, Ghani et al. [18, 19] have summarised that floods are caused
by meteorological factors (climate, as well as the duration and intensity of rainfall), geological features,
and the urbanisation process. Sinnakaudan et al. [20] have stated that, apart from floods being caused by
the random coincidence of several meteorological factors, the severity and consequences of the events are
also influenced by man’s use of the river’s catchment. In theory, it is therefore possible to accurately
predict environmental events such as floods by analysing the SSTD present in historical environmental data.

Therefore, this paper presents an approach for solving classification problems, applied adaptively to
the assessment of flood risk using evolving Spiking Neural Network methods. The case study involves five
years of environment-related temporal data (from 2012 until 2016) for Kuala Krai in Kelantan, Malaysia.
This article assumes little prior knowledge of data modelling techniques and neural network generations.


1.1. Data Modelling Techniques

Global Modelling. A global model is defined by a single function for the whole problem space [2, 21];
global modelling is applied in algorithms such as Support Vector Machine (SVM) and Multi-Layer Perceptron
(MLP). An SVM uses a kernel function, essentially an equation that divides data vectors into different
classes based on the region in which each data vector falls [2, 21]. A model created using global modelling
can be easily applied to new data; however, additional knowledge about the data, such as the nature of the
data and the knowledge database, is neglected in global modelling, which causes information loss [3, 12]. Due to this
characteristic of global modelling, it is not suitable to be used to analyse the dynamic and 

Local Modelling. Local modelling was introduced to solve the problems of global modelling, as it is more
adaptable to new data vectors [2, 21]. According to Kasabov [3], local modelling forms
subsets of the whole problem space, so that the whole problem space is represented by multiple subsets, or
clusters. Local modelling algorithms include K-means, Self-Organizing Maps (SOM), Fuzzy Clustering, and
Hierarchical Clustering.

Personalised Modelling. A personalised model is created for a single point from a subset of the whole
problem space, so that every new data vector can be classified into its corresponding class based on a model
that is constructed “on the fly” [3, 12, 22]. K-Nearest Neighbor (K-NN) is one such modelling technique: for
every new sample, the nearest K samples are selected from the data set using the Euclidean distance measure,
and a personalised vote among them then assigns the new sample to its appropriate cluster [3].
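
The K-NN procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the classifier used in the cited works: the function name `wknn_classify`, the inverse-distance weighting of the votes, and the toy two-feature samples are all assumptions introduced here.

```python
import math
from collections import defaultdict


def wknn_classify(samples, labels, query, k=3):
    """Label `query` by a distance-weighted vote among its k nearest samples."""
    # Euclidean distance from the query to every stored sample.
    distances = [(math.dist(s, query), lab) for s, lab in zip(samples, labels)]
    # The k closest samples form the neighbourhood built "on the fly".
    neighbours = sorted(distances, key=lambda pair: pair[0])[:k]
    # Assumed weighting: closer neighbours receive a larger vote.
    votes = defaultdict(float)
    for dist, lab in neighbours:
        votes[lab] += 1.0 / (dist + 1e-9)
    return max(votes, key=votes.get)


# Toy data (hypothetical): two features per sample.
train_x = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_y = ["Low-Risk", "Low-Risk", "High-Risk", "High-Risk"]
predicted = wknn_classify(train_x, train_y, (0.8, 0.9), k=3)
print(predicted)
```

With the toy data, the query point lies closest to the two ‘High-Risk’ samples, so their combined weighted vote determines the label.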

Table 1 summarises the comparison between global, local, and personalised modelling techniques,
highlighting important criteria such as the problem set covered, benefits, and limitations.


Table 1. Comparison between global, local and personalised modelling techniques 








Techniques              Global Modelling              Local Modelling                 Personalised Modelling
Problem Set             Entire problem space          A cluster from the entire       A single point from a cluster
                                                      problem space                   in the entire problem space
Reasoning Theory        Inductive reasoning           Transductive reasoning          Transductive reasoning
Applicable Algorithms   SVM                           K-means, SOM, Fuzzy             K-NN, WK-NN, WWK-NN
                                                      Clustering, Hierarchical
                                                      Clustering
Benefits                Provides an overview (big     Easier to adapt to new data     New input data can be labelled
                        picture) of the knowledge     and provides a better           with its corresponding cluster
                        without knowing the           explanation for individual      on the fly
                        details                       cases
Limitations             Offers limited knowledge      Requires knowledge of the       Employs a much more complex
                        to be extracted from the      number of available clusters    algorithm than global and
                        output data; difficult to     via input or cluster            local modelling
                        adapt to new data             initialisation; thus cannot
                                                      construct a model on the fly





1.2. Neural Networks Generations
The First Generation of Neural Networks. If the computational unit is used to define the
generations of neural networks, the networks employing McCulloch-Pitts neurons (known as perceptrons)
form the first generation. The neuron body performs a calculation on multiple
inputs and produces a single output; for example, given inputs x and y, the neuron body computes z = x + y
and outputs z. Such a neuron is capable of solving simple problems, but it does not apply any condition to
the produced output.

The Second Generation of Neural Networks. The idea behind this generation is to integrate an activation
function. Each neuron receives multiple inputs, which are processed by the neuron function, and when the
resulting value surpasses the activation function threshold, an output is produced. Various activation
functions can be integrated into the neuron, including sigmoid, Rectified Linear Unit (ReLU), and linear
activation functions.
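
To make the role of the activation function concrete, the following is a minimal sketch of a single second-generation neuron. The weighted sum, the weights, the bias, and the helper name `neuron_output` are assumptions for illustration; the text above names only the three activation functions.

```python
import math


def sigmoid(v):
    # Squashes any real value into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-v))


def relu(v):
    # Passes positive values through and clips negative values to zero.
    return max(0.0, v)


def linear(v):
    # Identity activation: the output equals the weighted sum.
    return v


def neuron_output(inputs, weights, bias, activation):
    """Weighted sum of the inputs followed by the chosen activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(total)


# Hypothetical inputs and weights; the weighted sum here is exactly 0.
inputs = [0.5, -1.0]
weights = [2.0, 1.0]
outputs = {f.__name__: neuron_output(inputs, weights, 0.0, f)
           for f in (sigmoid, relu, linear)}
print(outputs)
```

The same weighted sum yields different outputs depending on the activation chosen, which is the property that distinguishes this generation from the first.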

The Third Generation of Neural Networks. The third generation is the spiking neural network (SNN), which
employs spiking neurons (also known as integrate-and-fire neurons), where a spike signal is emitted upon
reaching a spike threshold. Further details are described in Section 1.3.
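
The integrate-and-fire behaviour can be illustrated with a small simulation. This is a sketch of a generic leaky integrate-and-fire neuron, not the neuron model of any specific architecture cited here; the leak factor, the reset to zero, the refractory period, and the function name `simulate_lif` are assumptions.

```python
def simulate_lif(input_current, threshold=1.0, leak=0.9, refractory_steps=2):
    """Return a 0/1 spike train for a leaky integrate-and-fire neuron."""
    potential = 0.0
    refractory_left = 0
    spikes = []
    for current in input_current:
        if refractory_left > 0:
            # The neuron ignores input while it recovers from a spike.
            refractory_left -= 1
            spikes.append(0)
            continue
        # Leak part of the stored potential, then integrate the new input.
        potential = leak * potential + current
        if potential >= threshold:
            spikes.append(1)
            potential = 0.0  # reset after firing
            refractory_left = refractory_steps
        else:
            spikes.append(0)
    return spikes


# A constant input (hypothetical values) makes the neuron fire periodically.
spike_train = simulate_lif([0.6, 0.6, 0.6, 0.6, 0.6, 0.6])
print(spike_train)
```

Unlike the second-generation neuron, the output is a sequence of discrete events in time, so the timing of spikes carries information.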


1.3. Spiking Neural Networks

The brain performs extremely well as a spatio-temporal information processor: when presented
with information, complex spatio-temporal paths and patterns are formed across the brain [4]. This has
motivated researchers to create a spatio-temporal data machine (STDM) for processing SSTD information based
on brain physiology. By imitating as closely as possible how the brain analyses data, a data
processing model can be constructed, which can later be used to process new incoming streams of data.

Inspired by the biological mechanisms of the brain, spiking neural networks have been introduced and shown
to be more powerful than conventional neural network models [4, 23]. The idea is to apply a spiking neuron
activation function to each individual neuron in a network, which is later used to measure the association
strength between different SSTD components. A network of spiking neurons with spatial memory is capable of
encoding, storing, recognising, and recalling spatial information patterns [1, 10], and therefore offers the
potential to create a spatial memory model for analysing spike patterns from a data stream.

Spiking neural networks have shown many possible implementations in various fields, as featured in
Kasabov et al. [1, 4, 6-8, 11, 24]. For instance, a study [24] has used a spiking neural network
architecture to understand the functional changes in the brain under treatment for opiate dependence. The
NeuCube EvoSpike architecture [4] has been developed to recognise brain signal patterns for integration with
neuromorphic cognitive systems.


2. METHODOLOGY 

The collected environmental data are first prepared into a format consisting of variables and their
corresponding measurements (in this case, the readings are acquired on a daily basis). The Kuala Krai
real-world readings include measurements of daily rainfall (mm), monthly rainfall (mm), average daily
temperature (°C), and wind speed (m s⁻¹). The last column of the measurements is labelled with the class,
either ‘High-Risk’ or ‘Low-Risk’, to be used for supervised learning.

Figure 1 visualises the spiking neural network architecture for analysing the environmental data. First,
SSTD-formatted data are encoded into input neurons, which are then loaded into a three-dimensional data array
acting as the SNN reservoir. The network is then trained on the environmental data, and the SNN classifier
subsequently classifies the resulting data into the ‘High-Risk’ and ‘Low-Risk’ classes, determining the
chance of flooding in a certain area. The network is continuously optimised and corrected using newly loaded
environmental data, and an optimised model is saved once a satisfactory result is achieved. The created
model can be continuously applied to further train and improve the accuracy of the network using new data.

The data in the form of SSTD were encoded [11, 25] into input neurons, each representing an
environment-related feature extracted from the data source. Next, the features are stochastically transmitted
among spatially distributed neurons (representing the spatial component) in a three-dimensional network of
spiking neurons; this creates a simple spiking neural network model for analysing the input data. Each
encoding process places the input neurons at different positions, providing different chances of association
between input neurons.

Figure 1. A spiking neural network architecture for capturing SSTD patterns
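
The encoding of a continuous reading into spikes can be sketched as follows. The exact encoder is not specified in this section, so the scheme below (a spike whenever the change between consecutive readings exceeds a threshold) is an assumption, as are the function name `encode_spikes`, the threshold value, and the rainfall readings.

```python
def encode_spikes(series, threshold):
    """Encode a time series as +1/-1/0 spikes from step-to-step changes."""
    spikes = []
    for previous, current in zip(series, series[1:]):
        change = current - previous
        if change > threshold:
            spikes.append(1)    # positive spike: reading rose sharply
        elif change < -threshold:
            spikes.append(-1)   # negative spike: reading fell sharply
        else:
            spikes.append(0)    # change too small to emit a spike
    return spikes


# Made-up daily rainfall readings in mm, for illustration only.
daily_rainfall_mm = [2.0, 2.5, 10.0, 30.0, 29.0, 12.0]
rain_spikes = encode_spikes(daily_rainfall_mm, threshold=5.0)
print(rain_spikes)
```

Each feature would produce its own spike train in this way, and these trains are what the input neurons feed into the three-dimensional network.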


2.1. Data Description 

The dataset covers historical environmental data from 2012-2016 (5 years) provided by the Malaysian
Meteorological Department. The first dimension is the temporal (time) dimension and the second dimension is
the reading of 6 features, measured once daily for 5 years at the Kuala Krai station: (i) water level in cm,
(ii) daily rainfall in mm, (iii) monthly rainfall in mm, (iv) wind speed in m s⁻¹, (v) air humidity in percent,
and (vi) temperature in degrees Celsius. A “Class” label is pre-defined in an additional column for the
supervised learning mode; its value is determined by the water level of a river cross-section, following the
Malaysian Meteorological Department water level manual. Data are labelled ‘High-Risk’ if the water level
surpasses a certain threshold value (determined in the manual), and ‘Low-Risk’ otherwise.
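
The labelling rule can be written as a one-line comparison. In the sketch below the threshold `DANGER_LEVEL_CM` is a placeholder, since the actual value comes from the water level manual and is not given here; the function name `label_risk` and the sample readings are likewise hypothetical.

```python
# Placeholder only: the real threshold is defined in the water level manual.
DANGER_LEVEL_CM = 2500.0


def label_risk(water_level_cm, danger_level_cm=DANGER_LEVEL_CM):
    """Return the class label for one daily water level reading."""
    # "Surpasses" is read as strictly greater than the threshold.
    if water_level_cm > danger_level_cm:
        return "High-Risk"
    return "Low-Risk"


# Made-up readings in cm, for illustration only.
daily_water_levels_cm = [1800.0, 2450.0, 2500.0, 2620.0]
class_column = [label_risk(level) for level in daily_water_levels_cm]
print(class_column)
```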

As visualised in Figure 2, ‘High-Risk’ samples were created with a time length of 14 days, from day-14 until
day-1, during which the patterns of the 6 features gradually change as the flood on day-0 approaches.
‘Low-Risk’ samples were created by selecting a total of 14 days of readings that show absolutely no flooding,
to be used as a control class.


[Figure 2 labels: Day 12, 11, 10, ..., 1; Transition Period; Day 0 (Flood)]


Figure 2. Temporal data is fed backwards, from day-12, day-11, and so forth approaching flood day on day-0 


As the day of the flood approaches, the feature measurements are expected to change in the following pattern:
water level increases, daily rainfall increases, and air humidity increases, while
temperature decreases. Hidden patterns may be formed by the monthly rainfall and wind speed
readings. The spiking neural network is used to analyse and learn the hidden spike patterns in the
environmental data and to reason about the combination of variables, which together can be used to estimate
and label the risk of flooding. The correlation between the features forms a model for processing and
labelling new incoming input data based on previously trained data via supervised and unsupervised learning.

Over the five years, seven flood occurrences were observed, enabling seven samples to be formed to train
the network. Each sample consists of a ‘High-Risk’ and a ‘Low-Risk’ case for a flood case study, with a time
length of 12 days over which the pattern of changes in the readings is observed.
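
Forming one sample amounts to slicing the daily records that precede a flood day. The sketch below assumes the records are held in a chronological list and that the index of the flood day is known; `extract_window` and the stand-in records are hypothetical names and data.

```python
def extract_window(daily_records, flood_index, length=12):
    """Return the `length` records that precede the flood day (day-0)."""
    start = flood_index - length
    if start < 0:
        raise ValueError("not enough history before the flood day")
    # The slice stops just before the flood day: day-12 ... day-1.
    return daily_records[start:flood_index]


# Stand-in for 30 daily feature vectors; each integer represents one day.
records = list(range(30))
window = extract_window(records, flood_index=20)
print(len(window), window[0], window[-1])
```

A ‘Low-Risk’ control sample could be cut with the same function by pointing `flood_index` at a day that is not followed by a flood.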


2.2. Experimental Procedures 
The experiments are conducted such that the data (pattern) model is first constructed and then simulated
in two phases: (i) training and (ii) testing. The simulation was conducted using several algorithms,
namely multilayer perceptron (MLP) for experiment 1, multiclass classification for experiment 2, and an SNN
architecture that incorporates a wkNN classifier for experiment 3. The experiments used the
created samples to assess the risk of flood 1 day earlier and 3 days earlier,
with the sample time lengths split as specified in Table 2.

During the experiments, the SSTD were encoded into input neurons and transmitted through the
three-dimensional network of spiking neurons in the manner described in Section 2. Because each
encoding process places the input neurons at different positions, providing different chances of association
between input neurons, an optimal mapping of the input neurons in the spiking neural network cube
creates the optimal model for validating the associations between variables.

The output from the neural network cube is transmitted to an SNN classifier, which classifies the result
into ‘High-Risk’ or ‘Low-Risk’ for flooding based on the patterns constructed from the previous days. The
optimal model can then be applied to new data to predict the risk of flooding.

To achieve the best possible accuracy with less time and resource consumption, the
execution parameters of the experiments were optimised using the Grid Search Optimization algorithm, with
a set of limits imposed on each execution parameter. The tuning, which searches for the most suitable
parameter values without affecting the fair comparison between experimental executions, used the
configuration presented in Table 3.
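
A grid search evaluates every combination of candidate parameter values and keeps the best one. The sketch below is generic: `toy_accuracy` is a hypothetical stand-in for training the network and measuring its accuracy, and the parameter names and candidate values are assumptions chosen only to lie within plausible limits.

```python
import itertools


def grid_search(param_grid, evaluate):
    """Try every parameter combination and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_accuracy(params):
    # Hypothetical scoring function standing in for a real training run.
    return (1.0
            - abs(params["firing_threshold"] - 0.5)
            - abs(params["stdp_rate"] - 0.01))


grid = {
    "firing_threshold": [0.1, 0.3, 0.5, 0.7, 0.9],
    "stdp_rate": [0.01, 0.05, 0.10],
}
best_params, best_score = grid_search(grid, toy_accuracy)
print(best_params, best_score)
```

The number of evaluations grows with the product of the candidate counts, which is why limits on each parameter keep the search affordable.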


Table 2. Training and testing samples of different time length are used to feed the network with 
environmental data for an early prediction of flood risk 








Sample Type            Data Split    Percentage of Trained    Early Day
                       Train:Test    Data/All (%)             Predict
1 Day Early Sample     12:12         100.00                   1
3 Days Early Sample    10:12         83.30                    3
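
One reading of Table 2 is that a prediction made n days early uses only the readings available at that point, so a 12-day window shortened to 10 days gives the 3-days-early sample. The sketch below follows that interpretation, which is an assumption, as is the helper name `truncate_for_early_prediction`.

```python
def truncate_for_early_prediction(window, days_early):
    """Keep only the readings available `days_early` days before the flood."""
    if not 1 <= days_early <= len(window):
        raise ValueError("days_early must be between 1 and the window length")
    # A 1-day-early prediction uses the whole window (up to day-1).
    kept = len(window) - (days_early - 1)
    return window[:kept]


window = list(range(12, 0, -1))  # day-12, day-11, ..., day-1
one_day = truncate_for_early_prediction(window, 1)
three_day = truncate_for_early_prediction(window, 3)
share = round(100 * len(three_day) / len(window), 2)
print(len(one_day), len(three_day), share)
```

Under this reading the shortened sample ends at day-3, and 10 of 12 days corresponds to roughly 83.3% of the full window.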





Table 3. Optimised parameters for experiment execution 





Parameter             Upper Limit Optimisation    Lower Limit Optimisation    Optimal Value
AER Threshold         1.0                         0.1                         0.5
Small World Radius    1.0                         5.0                         2.5
STDP Rate             0.01                        0.10                        0.01
Firing Threshold      0.1                         1.0                         0.47
Refractory Time       1                           10                          2 or 7
Time Rounds           1                           10                          4 or 5
deSNN Mod             0.01                        0.50                        0.157
deSNN Drift           0.01                        0.50                        0.329





3. RESULTS AND DISCUSSION 

The experiment was executed using three different algorithms: experiment 1 used the
MLP algorithm, experiment 2 used the Multiclass Classification algorithm, and
experiment 3 used SNN with wkNN. The result is produced by comparing the correctness
of the data classified by the network after it has undergone supervised training and been executed with
unsupervised training, based on a Train:Test data split of 12:12 for 1-day earlier prediction and 10:12 for
3-days earlier prediction.
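
Classification correctness of this kind is commonly reported as the percentage of test samples whose predicted label matches the true label. The sketch below shows that calculation; the labels are made up for illustration and are not results from the experiments.

```python
def accuracy(predicted, actual):
    """Percentage of samples whose predicted label matches the true label."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Made-up labels, for illustration only.
actual_labels = ["High-Risk", "Low-Risk", "High-Risk", "Low-Risk"]
predicted_labels = ["High-Risk", "Low-Risk", "Low-Risk", "Low-Risk"]
score = accuracy(predicted_labels, actual_labels)
print(score)
```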

Figure 3 shows the overall accuracy of flood risk assessment produced by MLP, the
multiclass classifier, and the SNN architecture with wkNN. In general, the SNN with wkNN classifier was
trained to create a personalised model for assessing flood risk, while parameter optimisation of the model
helped it to produce the highest accuracy in assessing flood risk compared with the other algorithms.

In this respect, a more radical approach has been taken to (i) create a personalised data model and (ii)
create data samples suitable for generating the model for assessing the risk of flood. Most
conventional global modelling techniques are able to classify results with high accuracy, but they
lack the capability to incorporate temporal components into the data model (for observing occurrence
patterns and introducing the relationships between variables). A comparison between the results achieved by
conventional global modelling techniques and those of the SNN with wkNN classifier has demonstrated that
personalised modelling performs better, producing results with higher accuracy by considering the patterns
of change in the variables over the temporal component.

Figure 3. Result of flood case risk assessment for Kuala Krai for the years 2012-2016 (5Y)


Notably, SNN with wkNN produced the classification result with the highest accuracy
compared with MLP and the multiclass classifier for both the 1-day and the 3-days earlier prediction of flood
risk. This simulation shows that personalised modelling is capable of producing a better classification
result for a specific use case than other data modelling techniques. It should be clarified that, for flood
cases in Malaysia, early prediction is possible up to 3 days ahead, since flooding most commonly occurs after
continuous rain over a few days. Observing the changes in the patterns of environmental variables using SNN
made this result possible.


4. CONCLUSION 

This paper has presented an evolving Spiking Neural Network method for classification problems,
with a case study of real-world flood risk assessment. Classification using SNN with wkNN
showed a significant increase in accuracy compared with the conventional MLP and multiclass classifier,
owing to the capability of the personalised model to represent a specific case study rather than the general
(global) model. Personalised models can be created for each region of an area, so that each region can be
individually represented and analysed.

The outcome of the research may be applied in various ways, including the development of an alert system
to predict environmental disasters such as floods. Such technology could help to
reduce the risk of accidental death and property loss by allowing an earlier alert to be broadcast to a
high-risk area. In this case, the output value can be a classification value or a regression value, where the
regression value is the reading of the river water level. Another possible application of this research is
integration into a flood disaster management decision support system for property valuation in flood-affected
areas; the system can analyse contributing factors in assessing the value of property depending on the
predicted risk of a flood event.